Binary adaptive semi-global matching based on image edges

Authors

  • Han Hu
  • Yuri Rzhanov
  • Philip J. Hatcher
  • R. Daniel Bergeron
Abstract

Image-based modeling and rendering is currently one of the most challenging topics in computer vision and photogrammetry. The key issue is building a set of dense correspondences between two images, a task known as dense matching or stereo matching. Among dense matching algorithms, Semi-Global Matching (SGM) is arguably one of the most promising for real-time stereo vision. Compared with global matching algorithms, SGM uses Dynamic Programming (DP) to aggregate matching cost from several (eight or sixteen) directions rather than along the epipolar line only. Thus, SGM eliminates the classical “streaking problem” and greatly improves accuracy and efficiency. In this paper, we aim to further improve the accuracy of SGM without increasing its computational cost. We propose setting the penalty parameters adaptively according to image edges extracted by edge detectors. We carried out experiments on the standard Middlebury stereo dataset and evaluated the performance of the modified method against the ground truth. The results show a noticeable accuracy improvement over the results obtained with fixed penalty parameters, while the runtime computational cost did not increase.

Keywords: Semi-global matching, dense matching, computer vision, 3D reconstruction, Canny edges.

1. BACKGROUND

Depth information about the environment has a wide range of applications, such as land surveying, driver assistance systems and indoor navigation. It can be estimated by applying a dense matching procedure to two images from a stereo camera system, and dense matching is the most crucial step in this processing pipeline. Current dense matching algorithms can be divided into two categories: local algorithms and global algorithms.
Local methods evaluate correspondences one point at a time, without considering neighboring points, while global methods seek a disparity assignment that minimizes a global cost function, which typically includes a data term and a smoothness term. Local methods are much faster than global methods, but they usually suffer from a lack of smoothness in the final disparity map. Semi-Global Matching (SGM), as proposed by Hirschmuller [1, 2], combines the advantages of both: it achieves high-precision depth estimation at a computational complexity low enough for real-time use on limited hardware resources. It is currently one of the most advanced and efficient dense matching algorithms and has proved successful in Digital Surface Model (DSM) generation [3] and driver assistance systems [4].

Further development of the algorithm follows two major research directions. The first is the optimization and acceleration of SGM implementations on different hardware architectures. This research focuses on implementations for Graphics Processing Units (GPUs) [5, 6] and on efficiency improvements on CPUs [7, 8]. The second direction concentrates on improving and evaluating SGM with respect to accuracy, computational complexity and memory requirements. In [1], a hierarchical approach using an image pyramid was proposed to initialize and refine the matching cost: the disparity obtained at a higher pyramid level is used to refine the matching cost calculation at the next lower level in order to accelerate convergence. In [9], the accuracy of four different penalty functions in the cost aggregation step was evaluated under two different types of matching cost calculation. Hirschmuller et al. [10] experimented with different cost calculation methods in three different stereo algorithms and concluded that hierarchical mutual information performed best for pixel-based global matching methods like SGM. Michael et al. [11] proposed individual adaptive penalties for different path orientations, where each path has its own weight and four penalty parameters that depend on intensity gradients (no edge selection). Tuning such a large number of parameters requires a large amount of data.

In this paper, we propose an adaptive way of adjusting the penalty parameters based on image edges, since image edges normally indicate disparity discontinuities. For the experiments, we consider the well-known Middlebury benchmark dataset [12] and evaluate the performance of the proposed modification by comparing the results with the ground truth.

The paper is organized as follows. Section 2 reviews the original Semi-Global Matching algorithm. Section 3 briefly introduces and explains the edge-based SGM algorithm and presents the experimental results and their evaluation. Section 4 gives the conclusion of our work and outlines planned improvements.

2. SEMI-GLOBAL MATCHING

2.1. Matching Cost Calculation with Mutual Information

Pixel-wise Mutual Information (MI) is considered to be insensitive to recording and illumination changes [1]. The original SGM method uses MI as its pixel matching cost. Hirschmuller et al. [10] found that, with SGM, mutual information performs better in most cases than other matching cost calculation methods such as Birchfield and Tomasi (BT) interpolation [13].
MI originates from information theory and is defined by the entropies $H$ of the two input images $I_1$ and $I_2$ and their joint entropy $H_{I_1,I_2}$:

$$MI_{I_1,I_2} = H_{I_1} + H_{I_2} - H_{I_1,I_2} \tag{1}$$

The entropies are calculated from the probability distributions $P$ of the image intensities:

$$H_I = -\int_0^1 P_I(i)\,\log P_I(i)\,di \tag{2}$$

$$H_{I_1,I_2} = -\int_0^1\!\int_0^1 P_{I_1,I_2}(i_1,i_2)\,\log P_{I_1,I_2}(i_1,i_2)\,di_1\,di_2 \tag{3}$$

Kim et al. [14] transformed the entropy calculation into discrete form using a Taylor expansion. As a result, the joint entropy is calculated as a sum of data terms that depend on the corresponding intensities at a pixel $p$:

$$H_{I_1,I_2} = \sum_p h_{I_1,I_2}(I_{1p}, I_{2p}) \tag{4}$$

$$h_{I_1,I_2}(i,k) = -\frac{1}{n}\,\log\big(P_{I_1,I_2}(i,k) \otimes g(i,k)\big) \otimes g(i,k) \tag{5}$$

The entropy of a single image is calculated analogously:

$$H_I = \sum_p h_I(I_p) \tag{6}$$

$$h_I(i) = -\frac{1}{n}\,\log\big(P_I(i) \otimes g(i)\big) \otimes g(i) \tag{7}$$

where $P$ is the intensity distribution, $n$ is the total number of correspondences and $\otimes g$ denotes convolution with a Gaussian. In Eqs. (4) and (6) the data terms are evaluated at the intensities $i = I_{1p}$ and $k = I_{2p}$. The resulting definition of MI is hence

$$MI_{I_1,I_2} = \sum_p mi_{I_1,I_2}(I_{1p}, I_{2p}) \tag{8}$$

$$mi_{I_1,I_2}(i,k) = h_{I_1}(i) + h_{I_2}(k) - h_{I_1,I_2}(i,k) \tag{9}$$

Therefore, the matching cost based on MI is defined as

$$C_{MI}(p,d) = -mi_{I_1,I_2}(I_{bp}, I_{mq}) \tag{10}$$

where $I_b$ and $I_m$ denote the base and match images, and $q$ is the pixel in the match image that corresponds to pixel $p$ of the base image at disparity $d$:

$$q = e_{bm}(p,d) \tag{11}$$

Since the calculation of the joint intensity distribution requires an initial disparity map to warp the match image towards the base image, SGM uses an iterative computation strategy in which the initial disparity map is assigned randomly.

2.2. Cost Aggregation

Traditionally, the energy $E(D)$ of a disparity map $D$ is calculated as

$$E(D) = \sum_p \Big( C(p, D_p) + \sum_{q \in N_p} P_1\,T\big[|D_p - D_q| = 1\big] + \sum_{q \in N_p} P_2\,T\big[|D_p - D_q| > 1\big] \Big) \tag{12}$$

where $N_p$ is the set of neighbors of pixel $p$ and $T[\cdot]$ equals 1 if its argument is true and 0 otherwise. The first term (the data term) is the sum of the matching costs of all pixels $p$. The second term adds a constant penalty $P_1$ for every neighboring pixel $q$ of $p$ whose disparity differs from the disparity of $p$ by exactly 1.
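Eq. (12) can be evaluated directly for any candidate disparity map, which helps when checking how the two penalties trade off against the data term. The following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes a 4-connected neighborhood $N_p$, integer disparities and a precomputed cost volume; the function name `sgm_energy` and its arguments are hypothetical.

```python
import numpy as np


def sgm_energy(cost, disp, p1, p2):
    """Evaluate E(D) of Eq. (12) for an integer disparity map.

    cost: (H, W, D) array of matching costs C(p, d).
    disp: (H, W) integer array with the disparity D_p of every pixel.
    Assumption: N_p is the 4-connected neighborhood. As in Eq. (12), each
    pair of neighbors is visited once from either side, so it counts twice.
    """
    h, w, _ = cost.shape
    rows, cols = np.indices((h, w))
    # Data term: C(p, D_p) summed over all pixels.
    data = cost[rows, cols, disp].sum()

    smooth = 0.0
    # Absolute disparity differences between vertical and horizontal neighbors.
    for diff in (np.abs(np.diff(disp, axis=0)), np.abs(np.diff(disp, axis=1))):
        smooth += p1 * np.count_nonzero(diff == 1)  # jumps of exactly 1
        smooth += p2 * np.count_nonzero(diff > 1)   # larger jumps
    return float(data + 2.0 * smooth)
```

With unit costs, the single row `[0, 1, 3]` and penalties `p1=1`, `p2=4`, the data term contributes 3 and the two disparity jumps contribute 2 × (1 + 4), so the function returns 13.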
The third term adds a larger constant penalty $P_2$ for every neighboring pixel $q$ whose disparity differs from that of $p$ by more than 1. Stereo matching is thus formulated as finding the disparity image $D$ that minimizes the energy function $E(D)$. This global minimization over the 2D image is NP-complete, whereas the minimization along individual image rows (1D) can be solved efficiently using Dynamic Programming (DP) [15]. However, it is well known that minimizing along separate epipolar lines causes the “streaking problem” [15], because the image rows are processed independently. SGM solves this problem by aggregating matching cost from many different directions.

Figure 1. Cost aggregation. Left: 16-path cost aggregation at a pixel p in 2D image space. Right: illustration of the horizontal path cost structure on a single image row.

The aggregation sums the costs of all 1D minimum-cost paths that end in pixel $p$ at disparity $d$, as illustrated in Figure 1 (left). The cost $L_r(p,d)$ of pixel $p$ at disparity $d$ along one particular path direction $r$ is calculated recursively:

$$L_r(p,d) = C(p,d) + \min\Big( L_r(p-r,d),\; L_r(p-r,d-1) + P_1,\; L_r(p-r,d+1) + P_1,\; \min_i L_r(p-r,i) + P_2 \Big) - \min_k L_r(p-r,k) \tag{13}$$

The first term is the matching cost, as in the energy function $E(D)$, and $p - r$ is the previous pixel along the path. The second term adds the lowest cost of the previous pixel, including the penalty $P_1$ or $P_2$ where the disparity changes. The last term is subtracted from the aggregated cost to avoid numerical overflow; it is the same for all disparities of pixel $p$. The final cost of pixel $p$ at disparity $d$ is the sum of the costs from all paths:

$$S(p,d) = \sum_r L_r(p,d) \tag{14}$$

2.3. Disparity Computation

After the aggregated cost cube has been computed, the disparity of a pixel $p$ is determined by selecting, within its disparity search range, the disparity with the minimum cost, that is $\arg\min_d S(p,d)$. This yields the disparity image $D_b$ that corresponds to the base image.
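The recursion of Eq. (13) and the summation of Eq. (14) can be sketched for a single image row as follows. This is an illustrative Python/NumPy sketch under simplifying assumptions, not the authors' code: it uses fixed penalties $P_1$ and $P_2$, treats disparities outside the search range as infinitely expensive, and sums only two opposite horizontal paths instead of eight or sixteen. The names `aggregate_path` and `aggregate_row` are hypothetical.

```python
import numpy as np


def aggregate_path(cost, p1, p2):
    """Aggregate costs along one path direction r (Eq. 13).

    cost: (N, D) array with C(p, d) for the N pixels in path order.
    Returns L_r as an (N, D) array.
    """
    n, d = cost.shape
    agg = np.empty((n, d), dtype=np.float64)
    agg[0] = cost[0]  # the first pixel on the path has no predecessor
    for x in range(1, n):
        prev = agg[x - 1]
        prev_min = prev.min()  # min_k L_r(p - r, k)
        # Shifted copies give L_r(p - r, d - 1) and L_r(p - r, d + 1);
        # disparities outside the range are +inf so they are never selected.
        minus = np.concatenate(([np.inf], prev[:-1]))
        plus = np.concatenate((prev[1:], [np.inf]))
        best = np.minimum.reduce(
            [prev, minus + p1, plus + p1, np.full(d, prev_min + p2)]
        )
        agg[x] = cost[x] + best - prev_min
    return agg


def aggregate_row(cost_row, p1, p2):
    """Sum two opposite horizontal paths, a 2-path instance of Eq. (14)."""
    forward = aggregate_path(cost_row, p1, p2)
    backward = aggregate_path(cost_row[::-1], p1, p2)[::-1]
    return forward + backward
```

Setting both penalties to zero reduces each path to the raw matching cost, which is a quick sanity check that the subtraction of the previous minimum is wired correctly.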
The disparity image $D_m$ that corresponds to the match image can be determined from the same costs by traversing the epipolar line that corresponds to the pixel $q$ of the match image. For sub-pixel disparity accuracy, a quadratic curve is fitted through the costs at the minimum-cost disparity and its two neighboring disparities, and the position of the curve's minimum is taken as the disparity. Occlusions and false matches can be detected by performing a consistency check between $D_b$ and $D_m$. This check enforces the uniqueness constraint by permitting one-to-one mappings only.
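The three steps of the disparity computation (minimum-cost selection, quadratic sub-pixel fit and consistency check) can be sketched as below. This is a hedged Python/NumPy illustration rather than the authors' implementation. It assumes rectified images in which the matching pixel is $q = (x - d, y)$, marks rejected pixels with NaN, and uses a tolerance `max_diff` of one disparity level; the function names and that threshold are assumptions.

```python
import numpy as np


def winner_takes_all(agg):
    """Select, per pixel, the disparity with the minimum aggregated cost."""
    return np.argmin(agg, axis=-1)


def subpixel_refine(agg, disp):
    """Fit a parabola through the costs at d - 1, d and d + 1."""
    n_disp = agg.shape[-1]
    d = np.clip(disp, 1, n_disp - 2)  # both neighbors must exist
    idx = d[..., None]
    c0 = np.take_along_axis(agg, idx - 1, axis=-1)[..., 0]
    c1 = np.take_along_axis(agg, idx, axis=-1)[..., 0]
    c2 = np.take_along_axis(agg, idx + 1, axis=-1)[..., 0]
    denom = c0 - 2.0 * c1 + c2
    safe = np.where(denom > 0, denom, 1.0)  # avoid division by zero
    offset = np.where(denom > 0, 0.5 * (c0 - c2) / safe, 0.0)
    interior = (disp >= 1) & (disp <= n_disp - 2)
    # Border disparities have only one neighbor and are left unrefined.
    return np.where(interior, d + offset, disp.astype(np.float64))


def left_right_check(disp_base, disp_match, max_diff=1):
    """Keep base disparities confirmed by the match image, else NaN.

    Assumption: rectified images, so q = (x - d, y).
    """
    h, w = disp_base.shape
    xq = np.arange(w)[None, :] - np.rint(disp_base).astype(int)
    inside = (xq >= 0) & (xq < w)
    rows = np.arange(h)[:, None]
    dm = disp_match[rows, np.clip(xq, 0, w - 1)]
    ok = inside & (np.abs(disp_base - dm) <= max_diff)
    return np.where(ok, disp_base.astype(np.float64), np.nan)
```

For costs sampled from the parabola $(d - 1.3)^2$ at $d = 0, \dots, 3$, `winner_takes_all` selects disparity 1 and `subpixel_refine` recovers 1.3, because the quadratic fit is exact for quadratic costs.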

Similar resources

Contourlet-Based Edge Extraction for Image Registration

Image registration is a crucial step in most image processing tasks for which the final result is achieved from a combination of various resources. In general, the majority of registration methods consist of the following four steps: feature extraction, feature matching, transform modeling, and finally image resampling. As the accuracy of a registration process is highly dependent to the fe...

Hybrid-based Dense Stereo Matching

Stereo matching generating accurate and dense disparity maps is an indispensable technique for 3D exploitation of imagery in the fields of Computer vision and Photogrammetry. Although numerous solutions and advances have been proposed in the literature, occlusions, disparity discontinuities, sparse texture, image distortion, and illumination changes still lead to problematic issues and await be...

Image Zooming using Non-linear Partial Differential Equation

The main issue in any image zooming techniques is to preserve the structure of the zoomed image. The zoomed image may suffer from the discontinuities in the soft regions and edges; it may contain artifacts, such as image blurring and blocky, and staircase effects. This paper presents a novel image zooming technique using Partial Differential Equations (PDEs). It combines a non-linear Fourth-ord...

DPML-Risk: An Efficient Algorithm for Image Registration

Targets and objects registration and tracking in a sequence of images play an important role in various areas. One of the methods in image registration is feature-based algorithm which is accomplished in two steps. The first step includes finding features of sensed and reference images. In this step, a scale space is used to reduce the sensitivity of detected features to the scale changes. Afterw...

A Nonlinear Grayscale Morphological and Unsupervised method for Human Facial Synthesis Based on an Example Image

Human facial generation of example image is used as a requirement for biometric applications for the purpose of identifying individuals. In this paper, face generation consists of three main steps. In the first step, detection of significant lines and edges of the example image are carried out using nonlinear grayscale morphology. Then, hair areas are identified from the face of sample. The fin...

Publication date: 2015